Direct prediction of antimicrobial resistance in Pseudomonas aeruginosa by metagenomic next-generation sequencing

Objective Pseudomonas aeruginosa has strong drug resistance and can tolerate a variety of antibiotics, which is a major problem in the management of antibiotic-resistant infections. Direct prediction of multi-drug resistance (MDR) resistance phenotypes of P. aeruginosa isolates and clinical samples by genotype is helpful for timely antibiotic treatment. Methods In the study, whole genome sequencing (WGS) data of 494 P. aeruginosa isolates were used to screen key anti-microbial resistance (AMR)-associated genes related to imipenem (IPM), meropenem (MEM), piperacillin/tazobactam (TZP), and levofloxacin (LVFX) resistance in P. aeruginosa by comparing genes with copy number differences between resistance and sensitive strains. Subsequently, for the direct prediction of the resistance of P. aeruginosa to four antibiotics by the AMR-associated features screened, we collected 74 P. aeruginosa positive sputum samples to sequence by metagenomics next-generation sequencing (mNGS), of which 1 sample with low quality was eliminated. Then, we constructed the resistance prediction model. Results We identified 93, 88, 80, 140 AMR-associated features for IPM, MEM, TZP, and LVFX resistance in P. aeruginosa. The relative abundance of AMR-associated genes was obtained by matching mNGS and WGS data. The top 20 features with importance degree for IPM, MEM, TZP, and LVFX resistance were used to model, respectively. Then, we used the random forest algorithm to construct resistance prediction models of P. aeruginosa, in which the areas under the curves of the IPM, MEM, TZP, and LVFX resistance prediction models were all greater than 0.8, suggesting these resistance prediction models had good performance. Conclusion In summary, mNGS can predict the resistance of P. aeruginosa by directly detecting AMR-associated genes, which provides a reference for rapid clinical detection of drug resistance of pathogenic bacteria.


Introduction
Pseudomonas aeruginosa is a Gram-negative opportunistic bacterium that causes a variety of acute and chronic infections in humans.Due to its increasing incidence, high treatment difficulty, and high fatality rate, it is a major problem and focus of clinical treatment (Fernández-Barat et al., 2017;Reynolds and Kollef, 2021).The global burden report of bacterial anti-microbial resistance (AMR) in 2019 pointed out that AMR poses a major threat to human health worldwide, especially the resistance of P. aeruginosa to multiple antibiotics (Murray et al., 2022).With a relatively large bacterial genome, P. aeruginosa has good tolerance and adaptability to various environments, and has natural resistance to a variety of antibiotics (Klockgether et al., 2011), among which difficult-to-treat resistance P. aeruginosa (DTR-PA) urgently needs research and development of new antibiotics to deal with opportunistic infections caused by DTR-PA (Qin et al., 2022).
Rapid detection and elucidation of resistance mechanisms are essential for timely antibiotic treatment and monitoring of multi-drug resistance (MDR) P. aeruginosa.Nowadays, whole genome sequencing (WGS) has become an advanced method for detecting AMR (Maladan et al., 2021), while the clinical application of WGS is limited by high cost, high sample requirements, and technical analytical hurdles.The application of antibiotic susceptibility testing (AST) in the diagnosis of antimicrobial-resistant pathogens and their antibiogram is timeconsuming, cumbersome operation and has a low positive rate, which cannot meet the clinical needs (Shanmugakani et al., 2020;Hu and Zhao, 2023).Recently, metagenomics next-generation sequencing (mNGS) technology enables the identification of pathogens and AMR-associated genes directly from clinical samples based on its ability of the rapid diagnosis of unexplained infections (Yan et al., 2021;Liu et al., 2023).Therefore, considering the complex resistance mechanisms and the high prevalence of variant-driven resistance of DTR-PA, it is feasible to predict resistance phenotypes of DTR-PA by detecting AMR-associated genes via mNGS.
In this study, mNGS was used to directly predict resistance to multiple antibiotics in P. aeruginosa, such as imipenem (IPM), meropenem (MEM), piperacillin/tazobactam (TZP), and levofloxacin (LVFX).First, AMR-associated genes of P. aeruginosa were screened according to the resistance genotype and phenotype of WGS and AST data.Then, we matched the AMR-associated genes from mNGS data of P. aeruginosa in clinical samples with the above genes to obtain the relative abundance of genes.Finally, the direct resistance prediction models for P. aeruginosa were constructed using random forest (RF) based on the relative abundance of AMR-associated genes.

Genome and AMR phenotype collection of Pseudomonas aeruginosa strains
The NCBI NDARO database 1 and reference (Liu et al., 2023) were performed for searching P. aeruginosa strains with both whole genome 1 https://www.ncbi.nlm.nih.gov/pathogens/antimicrobial-resistance/sequences and unambiguous resistance phenotypes to IPM, MEM, TZP, and LVFX.Referencing the previous research (Hu and Zhao, 2023), low quality genomes were eliminated using a customized criterion developed with reference to NCBI genome exclusion rules.See the Supplementary Figure S1 for the filtering rules.After this filtering step, a total of 494 P. aeruginosa whole genomes were obtained, including 394 P. aeruginosa strains from NCBI NDARO database and 100 P. aeruginosa strains from the referenced study (Liu et al., 2023).Detailed information on the genomes of P. aeruginosa strains is displayed in Supplementary Table S1.

Curation of the Pseudomonas aeruginosa reference database and mapping
Prodigal v2.6.3 was used to predict the genes of the collected genome, and compare the predicted genes with eggNOG-mapper v2 (Cantalapiedra et al., 2021) to get the gene name; finally the gene copy number was obtained from knowing number of gene names in bacterial genome (Supplementary Table S2).Subsequently, R language function t.test was applied to perform differential analysis on the resistance and sensitive groups of the same antibiotic.The obtained p values were corrected to q values using the R language function p.adjust (holm method).Genes with q-values <0.05 were regard as AMR-associated genes.

Patients, samples, and processing
A total of 74 P. aeruginosa positive sputum samples from 59 clinical patients were collected retrospectively from Peking University Shenzhen Hospital from February 2023 to September 2023.Our study has obtained the oral informed consent from all subjects, and approved by the Medical Ethics Committee of Peking University Shenzhen Hospital.

Selection of machine learning method
To obtain an appropriate machine learning method, we compared the performance of prediction models for the MEM resistance constructed by RF, logistic regression, and SVM.The results showed that the prediction models built by RF had the best performance of the prediction models with the maximum area under the curve (AUC > 0.85 in both the test cohort and training cohort) (Figure 2).Therefore, RF was performed for the construction of prediction models.

FIGURE 1
The flowchart of this study.

Direct prediction of antibiotic resistance for Pseudomonas aeruginosa by mNGS
To directly predict resistance of P. aeruginosa to IPM, MEM, TZP, and LVFX from clinical specimens, we evaluated the applicability of mNGS to detect key AMR-associated genes.We collected 74 P. aeruginosa positive sputum samples and sent them for mNGS sequencing, of which one sample was eliminated due to quality problems.Subsequently, the AMR-associated genes in mNGS data were mapped with AMR-associated genes selected by machine learning to obtain their relative abundance.Then, according to the RF algorithms, we divided the samples into a test cohort and a training cohort in a 1:1 ratio to create the resistance prediction models for IPM, MEM, TZP, and LVFX, respectively.In the test cohort, the AUC of the IPM resistance prediction model was 0.885, with a sensitivity of 0.741 and a specificity of 0.926 (Figure 3A).The AUC of the MEM resistance prediction model was 0.857, with a sensitivity of 0.667 and a specificity of 0.889 (Figure 3C).The AUC of TZP and LVFX reached 0.823 (a specificity of 0.7 and sensitivity of 0.95), 0.848 (a specificity of 1 and a sensitivity of 0.682), respectively (Figures 3E,G).In the training set, the AUCs of the IPM, MEM, TZP, and LVFX resistance prediction models were 0.893, 0.888, 0.894, and 0.896, respectively (Figures 3B,D,F,H).Taken together, the resistance prediction models for IPM, MEM, TZP, and LVFX based on mNGS sequencing had good diagnostic performance.

Comparison of the relative abundance of the resistance genes between resistance and sensitive samples
We further compared the relative abundance of the resistance genes between resistance samples and sensitive samples.Supplementary Figures S3-S6 indicated the relative abundance of 20 AMR-associated genes between resistance samples and sensitive samples from IPM, MEM, TZP, and LVFX, respectively.The violin diagrams showed the relative abundance of the top three genes with most significant difference in resistance samples and sensitive samples (Figures 4A-D).For example, for the IPM resistance prediction model, the abundance of merE, tniQ, and mmIH in resistant samples was higher than in sensitive samples (p < 0.05, Figure 4A), suggesting that these genes were positively associated with IPM resistance.For the MEM resistance prediction model, the abundance of aadA4, rhsA, and tniR in resistant samples was higher than in sensitive samples (p < 0.05, Figure 4B), indicating these genes were positively associated with MEM resistance.For the TZP resistance prediction model, compared with the sensitive samples, the abundance of mmH and MA20-16375 was increased in resistant samples (p < 0.05, Figure 4C).Similarly, the abundance of fabH, IspA, and rfbE was higher in resistant samples (p < 0.05, Figure 4D) in the LVFX resistance prediction model, comparing to sensitive samples.These findings demonstrated the relative abundance of the AMR-associated genes between resistance samples and sensitive samples could influence the resistance or sensitivity of P. aeruginosa to antibiotics.

Discussion
In recent years, due to the abuse of antibiotics, the resistance of P. aeruginosa has increased greatly, resulting in MDR and extensive drug resistance strains often appear in clinical treatment, which brings great difficulties to the clinical treatment of patients (Lister et al., 2009;Chegini et al., 2020;Tenover et al., 2022).However, the complex mechanism of drug resistance hindered genotype-to-phenotype prediction of P. aeruginosa (Poole, 2002;Fajardo et al., 2008).In view of this, we explored the applicability of WGS and mNGS in the direct prediction of IPM, MEM, TZP, and LVFX resistance of P. aeruginosa.Briefly, using the available WGS data of P. aeruginosa, 20 AMR-associated genes with IPM, MEM, TZP, and LVFX resistance were identified, respectively.Subsequently, we constructed IPM, MEM, TZP, and LVFX resistance prediction models based on the results of mNGS sequencing.The AUCs of IPM, MEM, TZP, and LVFX resistance prediction models were all greater than 0.8, indicating that our prediction models had good performance in predicting resistance of P. aeruginosa to antibiotics.
Several previous studies have explored the performance of WGS in predicting P. aeruginosa resistance phenotypes (Khaledi and Weimann, 2020;Kim and Greenberg, 2020;Cortes-Lara et al., 2021).Most of these studies focused on DNA sequences of gene presence or absence and gene variant (Hyun and Kavvas, 2020;Kim and Greenberg, 2020;Cortes-Lara et al., 2021).However, the modelbuilding processes of the models constructed are cumbersome.Additionally, the predictive value of AMR-related gene signatures in P. aeruginosa needs to be further verified.The drug resistance mechanism of P. aeruginosa mainly involved the outer membrane channel protein OprD gene (Chevalier et al., 2017), the aminoglycoside modifying enzyme gene (Thacharodi and Lamont, 2022), the β-lactam coding gene (Doss et al., 2004), and the 16S rRNA methylase gene (Yokoyama et al., 2003).Liu et al. (2023) have constructed the resistance prediction models for P. aeruginosa by detecting deletion or mutation sites of these genes.In our study, genes with significant copy number differences between resistant strains and sensitive strains were regarded as candidate AMR-associated genes associated with IPM/ MEM/TZP/LVFX resistance.The candidate AMR-associated genes were used to build corresponding resistance prediction models by RF algorithms.For example, AMR-associated genes in TZP resistance prediction model are ymdF, stbD, hcnA, yebS, etc.The ymdF played a role in flagellum-dependent motility regulation of P. aeruginosa (Oguri et al., 2019).Flagellum motility plays an active role in many biological functions of bacteria, such as the formation of bacteria-host symbiosis, pathogenicity, and antibiotic resistance (Raina et al., 2019;Wadhwa and Berg, 2022).In the pathogenic process of Acinetobacter baumannii, flagellar dysfunction can significantly reduce its virulence (Corral et al., 2021).Moreover, ymdF as PA2146 homologs contributes to biofilm formation and drug tolerance in Escherichia coli, Klebsiella pneumoniae, and P. aeruginosa (Kaleta et al., 2022;Kaleta and Sauer, 2023), indicating ymdF plays an important role in TZP resistance for P. aeruginosa.In addition, stbD is a key component of IncFIB-4.1/4.2 single-replicon plasmids that influences bacterial resistance to antibiotics (Xu et al., 2022).The toxin-antitoxin system is widely present in pathogenic microorganisms, which promotes the formation of MDR bacteria by regulating several important cellular processes in cells (Bernard and Couturier, 1992;Lewis, 2010).stbD/E-pEP36 has reported to be a functional toxin-antitoxin hybrid module (Unterholzner et al., 2013), indicating stbD/E-pEP36 contributed to P. aeruginosa resistance to TZP.The high pathogenicity of P. aeruginosa is attributed to its production of multiple virulence factors and its resistance to several antimicrobials, among which sodium hypochlorite (NaOCl) is widely used because of its strong antibacterial effect.However, hydrogen cyanide derived from hcnA acts as a scavenger molecule that can quench the toxic effects of NaOCl, thereby contributing to P. aeruginosa resistance to NaOCl (da Cruz Nizer et al., 2023).The above information indicated that these AMR-associated genes were significant for P. aeruginosa resistance to TZP.Besides, uspA is a common AMR-associated gene in IPM and MEM resistance models.Universal stress proteins (USPs) are generally overexpressed in a variety of pathogens under various environmental stresses, among which uspA has been identified as a potential drug target against MDR-E.coli (Kvint et al., 2003).Furthermore, inhibition  (Castillo and Pérez, 2008).In our study, fabH is an AMR-associated genes with high contribution in LVFX resistance model.FabH is a key enzyme responsible for fatty acid biosynthesis, which is essential for many pathogens (Yuan et al., 2012).According to research findings, the activity ratio of fabH/fabF was the main determinant of antibiotic susceptibility of pathogens (Parsons et al., 2015), suggesting that focusing on the fabH/fabF activity ratio in P. aeruginosa could predict resistance to LVFX.Different from other drug resistance models (Khaledi and Weimann, 2020;Hu Zhao, 2023;Liu et al., 2023), we have screened AMR-associated genes by comparing the copy number of genes in the antibiotic resistance group and the sensitive group.Metagenomic studies have shown that copy number variation in the human microbiome is common and affects human health (Greenblum et al., 2015;Zeevi et al., 2019).The study have found that an increase in the copy number of antibiotic resistance genes (ARGs) on the multi-copy plasmid promoted the high expression of ARGs, thereby increasing drug resistance (Yao et al., 2022).Several recent studies have reported the relationship between gene duplication and bacterial resistance in clinical antibiotic-resistant strains by measuring copy number of resistant gene (Duvernay et al., 2011;McGann et al., 2014;Chirakul et al., 2019;Anderson et al., 2020).Besides, a new research demonstrated that ARG duplication could be an effective mechanism for the evolution of antibiotic resistance (Maddamsetti et al., 2024).These above studies indicated that measuring copy number changes of ARG in antibiotic-resistant isolates is very important for studying microbial resistance.In our study, AMR-associated genes in P. aeruginosa were identified by comparing genes with copy number differences between resistance strains and sensitive strains.Subsequently, four resistance models constructed by AMR-associated genes had good performance, which provide a new molecular resistance analysis tool for predicting antimicrobial resistance.
The detection of antimicrobial resistance mainly relies on the traditional drug susceptibility test, which has some shortcomings such as time-consuming and inappropriate empirical treatment.Therefore, a more rapid antibiotic sensitivity test is urgently needed to effectively reduce antibiotic resistance and improve clinical treatment.Given the widespread use of mNGS and its ability to directly identify AMR-associated genes from a variety of clinical samples, mNGS holds  Cao et al. 10.3389/fmicb.2024.1413434Frontiers in Microbiology 07 frontiersin.orggreat potential for rapid detection of AMR (Hoang et al., 2021;Ruppé et al., 2022).Hu and Zhao (2023) reported that the average reporting time of the NGS-based AST for clinical samples was 19.1 h, which effectively shortened the reporting time of traditional drug sensitivity detection technology.In our study, we identified AMR-associated genes in P. aeruginosa by comparing genes with copy number differences between resistance strains and sensitive strains.Then, P. aeruginosa positive sputum samples were sequenced by mNGS method, and AMR-associated genes in mNGS were mapped with AMR-associated genes selected by machine learning to obtain their relative abundance, thereby constructing the resistance prediction models with great performance (AUCs >0.8).Here, we have effectively demonstrated the great applicability of mNGS in the prediction of P. aeruginosa to four commonly used antibiotics (IPM, MEM, TZP, and LVFX) based on key AMR-associated features selected and prediction models constructed.
There are certain challenges with the mNGS method.First, the high cost of mNGS is a great challenge for the future extensive application of mNGS.Second, the comprehensive application of mNGS is limited by high host DNA content, nucleic acid contamination, complex interpretation of mNGS data, and the depth of sequencing.Third, as a new detection method, the positive result of mNGS represents the nucleic acid fragment of a certain pathogen detected in clinical specimens.

Conclusion
In this study, we first proposed genes with significant copy number differences between resistant strains and sensitive strains were associated with resistance of P. aeruginosa.These key AMR-associated genes and the prediction models can be used for mNGS to directly predict the resistance of P. aeruginosa in clinical samples, which have guiding significance for the clinical management of DTR-PA.

FIGURE 2
FIGURE 2Performance of prediction models constructed by different machine learning methods.

FIGURE 3
FIGURE 3 The resistance prediction models of Pseudomonas aeruginosa.The IPM resistance prediction of P. aeruginosa in the test set (A) and the training set (B).The MEM resistance prediction of P. aeruginosa in the test set (C) and the training set (D).The TZP resistance prediction of P. aeruginosa in the test set (E) and the training set (F).The LVFX resistance prediction of P. aeruginosa in the test set (G) and the training set (H).

FIGURE 4
FIGURE 4Comparison of the relative abundance of top three AMR-associated genes for the IPM (A), MEM (B), TZP (C), and LVFX (D) resistance of Pseudomonas aeruginosa between resistant samples and sensitive samples.